METHOD AND APPARATUS TO PREVENT REVERSE 
ENGINEERING AND TAMPERING 



BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates to methods and apparatus that can prevent, resist, or 
deter reverse engineering and tampering with information such as computer software or data 
files during both the static and dynamic states of its presence on a system. 

Background of the Invention 

Most computer software and data found on commercially available general purpose 
operating systems are exposed to a threat of being reverse engineered or tampered with 
using widely available disassembling, de-compiling, debugging, and in-circuit emulating 
utilities. Despite the employment of cryptographic algorithms, hardware dongles, and
software encryption, software and data remain vulnerable to security attacks. Such
vulnerability exists regardless of whether the software and data are present on a computer
system in a static state (such as on a hard drive or other non-volatile storage media) or a
dynamic state (such as in residence in a cache memory or main memory).

For example, by taking advantage of the appropriate utilities, an attacker skilled in 
the art of computer security or security cracking can observe and re-assemble the 
instructions of a software program by tracing their execution image in memory. The 
attacker can further monitor and/or alter a software program's secret operations such as its 
interactions with physical components of a computer. The attacker can also de-compile and
analyze compiled code in the static state and then alter critical sections of the compiled code 
to compromise security. 

To increase the difficulty for an attacker to observe, understand, or modify source 
code, companies such as Intel and Intertrust have introduced elaborate schemes of 
transformation or slicing of source code. The potential pitfall of these protection schemes is 
that they rely on the ingenuity of their designers. Attackers on the other hand similarly rely
on their ingenuity to reverse engineer a protector's design. Thus, the effectiveness of the
protection system becomes an ingenuity contest between the protector and the attacker.
Unfortunately, this fails to provide a scientific measure of how easy or how difficult it is for 
the ingenious protection mechanisms to be broken. 

For these and related reasons, we assume that (1) all compiled code and data files 
are observable given the availability of commercial hardware and software utilities; (2) all 
elaborate schemes can be reverse engineered by ingenuity; and (3) attackers know the 
design of security schemes every bit as well as the designers. We believe the security of a 
protection system should be predictable and measurable. The most appropriate and reliable 
measure is probably the computation time and cost required to crack the protection system. 
In particular, true security lies in a predictable, large work factor for attackers. Such a work
factor should be large enough to make it humanly impossible to comprehend the protected 
source code and data files, and exponentially time consuming and expensive for computers 
to do so. 

SUMMARY OF THE INVENTION 

The present invention protects computer software by adding to the software large 
numbers of obscuring instructions selected from an obscuring code bank. Preferably, the
obscuring instructions selected from the obscuring code bank are made to resemble the
computer code that is obscured to achieve uniqueness of obscuration at each installation. 
Such obscuration can be achieved through embodiments both at the source code level and 
the object code level although different apparatus may need to be employed. 

Preferably, the obscuring instructions are generated in functional groupings called
"blocks". Advantageously, at least some of the blocks of obscuring instructions are formed
from other blocks by a transformational relationship. Specifically, for any two successive
blocks, $C_1$ and $C_2$, that have a transformational relationship, the instructions in $C_2$ can be
determined and generated by performing a mathematical transformation T on number codes
associated with the instructions in $C_1$. And, in general, any block is generated by a series
of mathematical transformations T such that $C_N = T_N(T_{N-1}(\cdots(T_3(T_2(C_1)))\cdots))$, where $C_1$ is an
initial block of obscuring instructions selected from the obscuring code bank. Conversely,
instructions in $C_1$ can be determined and generated by performing the inverse mathematical
transformation on number codes associated with the instructions in $C_2$. Advantageously,
different mathematical transformations are used between different pairs of successive blocks
and the transformations are randomly selected so as to achieve uniqueness for each set of
obscuring instructions that is generated.

To enhance security, the obscured object code is encrypted and stored in the form of 
superblocks of concatenated blocks of code. Consequently, each block of code in a 
superblock can be decrypted only if the blocks of code that precede it in the superblock have 
previously been decrypted. This results in an obscured package of code that is resistant to 
analysis of the statically stored code or to any tampering while in the static state. 

Advantageously, the obscured object code may also be compressed to remove some 
of the redundancy arising from the use of mathematical transformations to generate some of 
the obscuring code blocks. 

In the present invention, the obscured code package is loaded into a computer's real 
memory block by block. A run time apparatus is employed to decrypt/decompress and 
generate the obscured instructions of each block starting with $C_1$. The generated
instructions of each block are loaded into memory by this apparatus at a dynamically
determined address that is unique for each block. After loading the block, the run time
apparatus switches control to the instruction block for execution. When execution of each 
block is completed, control is switched back to the run time apparatus to load the next 
block. The process continues until instructions in all the blocks are executed. The dynamic 
loading and execution of each block makes it virtually impossible to trace instructions that 
are only generated and executed in real time. 

A preferred method of operating the invention to protect a sequence of computer 
code comprises the steps of: preparing simple obscuring instructions that are 
comprehensible yet require considerably more time to read and understand; injecting a large 
number of obscuring instructions into the sequence of computer code in an automated 
process to produce an obscured sequence of computer instructions that in total is humanly 
impossible to read and understand; compressing and/or encrypting a static image of the 
obscured sequence to protect against direct decompilation; and executing the obscured 
instructions one instruction at a time, thereby making run time trace and observation a labor 
intensive manual process. Preferably, the method provides a computational work factor that
is exponential at least on the scale of $N^3$, where N is the number of obscured instructions, and
potentially may be as much as $e^{(\ldots)}$. For example, for 10,000,000 obscured instructions, it
can be expected that it would take over 250,000 years on a modern PC (e.g., 500 MHz
clock rate) to locate and reverse engineer the protected sequence of computer code.

In similar fashion, the present invention also protects data files by adding to the data
large numbers of obscuring data selected from an obscuring data bank. Blocks of obscuring 
data can also be generated that are related to each other by a mathematical transformation. 
The transformation can be performed either on the data itself or on number codes associated 
with each item of obscuring data. 

To enhance security, the obscured data may likewise be encrypted and/or 
compressed. This may be done as part of the same process that encrypts and/or compresses 
the obscured object code or it may be done separately. 

BRIEF DESCRIPTION OF THE DRAWINGS 

These and other objects, features and advantages of the invention will be more 
readily apparent from the following detailed description in which:

Fig. 1 describes a first part of the apparatus of the present invention. 

Fig. 2 describes a second part of the apparatus of the present invention. 

Fig. 3 describes a third part of the apparatus of the present invention. 

Fig. 4 describes a fourth part of the apparatus of the present invention. 

Fig. 5 illustrates the content of a data file used for run time decryption. 

Fig. 6 illustrates the structure of run time components of the system in reference to a 
computer's execution environment. 

Fig. 7 describes the process of loading and executing the first block of obscured code 
segments and the loading of data for the first N blocks. 

Fig. 8 describes the process of loading and executing the second obscured code
block and the transfer of control from the second block to the third block.

DETAILED DESCRIPTION OF THE DRAWINGS 

In Fig. 1, a pre-processor 102 parses source code 101 to generate serialized code 
blocks 104 and a critical function profile 106. This process is completed on the source code 
level at pre-compile time. Source code 101 is typically the source code used in critical
functions, such as the most crucial part to the overall security of a computer program, or
source code that contains the most essential implementation details in realizing certain
valuable design and other trade secrets. The present invention is primarily intended to protect the
critical function source code from being identified, observed, traced for execution or
modified (often referred to as "patched"). Source code 101 is considered serialized in the
present invention when all subroutine calls in source code 101 have been fully expanded
into sequentially listed instructions in one self-contained body function/subroutine. By
transforming source code 101 into the serialized code blocks 104, source code 101 is 
prepared for injecting obscuring code in later stage processes of the present invention. 

Pre-processor 102 uses a user defined security strength 103 as an input parameter in
determining the number of lines of code to be generated in each code block 104 and the total
number of code blocks 104. For the highest level of security strength 103, each code block
104 will contain at most only one line of source code 101 or one instruction, and the total
number of code blocks 104 is equal to or greater than the number of lines of source code in
the critical function source code 101. For the lowest level of security strength 103, there
could be only one code block 104 that may contain all the original source code 101 within
the single block. For other levels of security strength 103, the pre-processor 102 may
randomly set a number of code blocks, NC, and randomly determine the number of
instructions, NI, in each code block 104.
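
The block-planning rule described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names `plan_blocks` and `enum strength`, the three-level scale, and the use of `rand()` are hypothetical and are not taken from the invention. At the highest level the sketch creates exactly one block per line, although the text also allows more blocks than lines.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical strength levels; the text names only a highest and a lowest level. */
enum strength { STRENGTH_LOWEST, STRENGTH_MEDIUM, STRENGTH_HIGHEST };

/*
 * Fill sizes[] with the number of source lines assigned to each code block
 * and return the number of blocks (NC). sizes[] must have room for n_lines
 * entries. Returns 0 when there are no lines to distribute.
 */
size_t plan_blocks(size_t n_lines, enum strength s, size_t *sizes)
{
    size_t nc = 0, remaining = n_lines;

    assert(sizes != NULL);
    if (n_lines == 0)
        return 0;

    if (s == STRENGTH_LOWEST) {            /* one block holds everything */
        sizes[0] = n_lines;
        return 1;
    }
    if (s == STRENGTH_HIGHEST) {           /* one source line per block */
        for (nc = 0; nc < n_lines; nc++)
            sizes[nc] = 1;
        return n_lines;
    }
    /* Intermediate level: random block sizes until every line is placed. */
    while (remaining > 0) {
        size_t count = 1 + (size_t)rand() % remaining;   /* 1..remaining */
        sizes[nc++] = count;
        remaining -= count;
    }
    return nc;
}
```

Calling `plan_blocks(10, STRENGTH_HIGHEST, sizes)` yields ten blocks of one line each, while `STRENGTH_LOWEST` yields a single block of ten lines. `rand()` is used only for brevity; it is not a cryptographically strong source of randomness.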

Pre-processor 102 also generates a data set called critical function profile 106 that
describes the nature of the source code contained in the critical function source code.
Profile 106 is understandable to other components of the present invention and is utilized 
for selecting obscuring code that "looks" similar to make it difficult for an attacker to 
distinguish the original critical function source code 101 and the obscuring code that is 
injected later. 

As an example of obscuring a simple instruction in a function, consider a simple C
program segment where the critical operation instruction is "V1 = 1024":

int Function1 (....) {
    int V1;        // declare the variable as integer type
    V1 = 1024;     // assign value to V1
    return V1;     // return the value of V1
}

The objective here is to obscure the assignment operation V1 = 1024, which is simple
and straightforward and will not take more than a few seconds for a person skilled in the art
to understand. However, by simply injecting some obscuring code, the obscured code will
be significantly more time consuming to read and comprehend. The obscuring code
selected may include a number of assignment instructions and some simple calculation
instructions.

int ObsFunction1 (...) {
    int V1, V2, V3, V4;

    V1 = 1024;
    V1 = V1 + 1024;
    V2 = 1024;
    V2 = V2 + 1024;
    V3 = V2;
    V3 = V3 - V2;
    V4 = 1024;
    V4 = V4 + 1024;
    V4 = V4 - V4;
    V2 = V2 - 1024;
    V1 = V1 - 1024;
    V2 = V2 - 1024;
    V1 = V1 + V2 + V3 + V4;

    return V1;
}

With the added obscuring instructions, it is still possible to isolate V1 out of V2, V3,
and V4. However, now it will take a few minutes before a skilled person can read and
identify the original critical instruction "V1 = 1024". The complexity of this example
increases enormously when the original critical instructions are composed of 10 to 20 lines
of instructions, these instructions are mixed with obscuring instructions on the order of
millions including both instructions that are similar to the instructions to be protected and
instructions that are dissimilar, and the critical instructions are randomly spread across many
blocks. In fact, it becomes so time consuming to read and comprehend the obscured output
that it will become humanly impossible. For example, consider the difficulties involved in
interpreting a sequence of code if, instead of four values V1, V2, V3 and V4, the code
included 100,000 values, returned all 100,000 values and there was no indication of which
value or values had any significance.

Fig. 2 depicts an obscuring code generator 203 and two predefined code banks: an 
obscuring code bank 204 and a transformation function bank 205. In Fig. 2, obscuring code 
generator 203 generates obscuring code blocks 206 that are used to protect the critical 
function source code 101. Illustratively, the available storage size is one million lines of 
code, the size of each block 206 is 125 lines of code, and there are 8000 blocks of code. 
Obscuring code generator 203 uses the critical function profile 106, security strength 103, 
storage size 201, and execution time 202 as input parameters for generating obscuring code 
from the two predefined code banks. 

Obscuring code bank 204 is a database that contains program instructions previously 
created through a manual, automated, or a combination of manual and automated process. 
Associated with each instruction is a unique numeric code. Thus, each numeric code
identifies an obscuring instruction and the entire set of numeric codes identifies the entire 
set of obscuring instructions. Advantageously, each numeric code may simply be the 
memory address at which the instruction is stored in the obscuring code bank. Each code 
block 206 is a subset of the obscuring instructions available in obscuring code bank 204 and 
the instructions in block 206 can be identified by the numeric codes associated with those 
instructions. 
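
A minimal sketch of how a code block can be held as numeric codes and expanded against the bank is given here. The bank contents, the names `code_bank`, `bank_lookup` and `expand_block`, and the use of an array index as the numeric code (standing in for the memory address mentioned above) are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy obscuring code bank: the array index serves as the numeric code. */
static const char *const code_bank[] = {
    "V2 = 1024;",
    "V2 = V2 + 1024;",
    "V3 = V2;",
    "V3 = V3 - V2;",
    "V4 = V4 - V4;",
};
#define BANK_SIZE (sizeof code_bank / sizeof code_bank[0])

/* Return the instruction identified by a numeric code, or NULL if out of range. */
const char *bank_lookup(size_t code)
{
    return code < BANK_SIZE ? code_bank[code] : NULL;
}

/*
 * Expand a block, given as an array of numeric codes, into instruction text.
 * out[i] receives the instruction for codes[i] (NULL for an unknown code).
 * Returns the number of codes that were resolved successfully.
 */
size_t expand_block(const size_t *codes, size_t n, const char **out)
{
    size_t resolved = 0;

    assert(codes != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = bank_lookup(codes[i]);
        if (out[i] != NULL)
            resolved++;
    }
    return resolved;
}
```

A block is then fully described by its list of codes, for example `{2, 3, 4}`, which is what allows later stages to operate on numbers rather than on instruction text.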

The program instructions in bank 204 comprise a large pool of instructions that are 
often built upon expertise and experience of the database designers. A large number of 
them resemble the most frequently used instructions in commonly used programming 
languages although a significant portion of them are purely random code without predefined 
profile. As indicated in the example given above, obscuring code may perform a function
(e.g., returning the value V1 = 1024) that is useful to the critical function code to be
protected but do so in a way that is very inefficient. Indeed, it may be spectacularly
inefficient. Alternatively, as suggested by operations such as V4 = V4 - V4, the code may be
functional but may do nothing more than perform an operation and later perform the inverse
of the operation so as to produce no effect other than obscuring the code to be protected.
Other examples of obscuring code may have nothing to do with the operation of the code to be
protected but will still have to be deciphered because an attacker will not know which code
is relevant and which is not.

The presence of the obscuring instructions injected into the critical function source 
code has a direct impact on any attacker's ability to understand or modify the correct 
instructions in order to compromise the system's integrity. As an example, a segment of 10 
lines of instructions that performs certain essential functions of a software application is 
serialized, distributed into multiple code blocks and mixed with 1,000,000 lines of 
obscuring code instructions. Using modern-day microcomputers, one can assume that the
typical CPU clock speed is beyond 500 MHz. At such speeds, it takes a microcomputer no
more than 8 milliseconds to execute all 1,000,000+ instructions provided the instructions
perform relatively simple calculations, assignments and minimal I/Os. While the
computational overhead is relatively low, the job for an attacker to understand and modify
the key parts of these instructions is an insurmountable task. Even though the task of
obtaining and observing the instructions is already difficult, let's assume an attacker can
capture all 1,000,000 plus instructions and can observe and analyze them. This number of
instructions amounts to 16,667 pages of printout on regular letter sized paper. Assuming
the attacker can read at the speed of 3 minutes per page and work for an average of 8 hours a
day, it will take him over 100 days just to finish reading the content in order to reach a
shallow understanding of the instructions. As a practical matter, careful examination and
much more time will ordinarily be required to identify the original 10 lines of critical
instruction out of the 1,000,000 obscuring instructions.
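
The reading-time estimate above can be checked with a few lines of arithmetic. The sketch assumes 60 lines per printed page, a figure that is not stated in the text but is implied by 1,000,000 lines filling 16,667 pages; the function names are illustrative.

```c
#include <assert.h>

/* Pages needed to print n_lines at lines_per_page lines each, rounded up. */
unsigned long pages_for(unsigned long n_lines, unsigned long lines_per_page)
{
    assert(lines_per_page > 0);
    return (n_lines + lines_per_page - 1) / lines_per_page;
}

/* Working days needed to read the given number of pages, rounded up. */
unsigned long reading_days(unsigned long pages,
                           unsigned long minutes_per_page,
                           unsigned long hours_per_day)
{
    unsigned long minutes = pages * minutes_per_page;
    unsigned long minutes_per_day = hours_per_day * 60;

    assert(minutes_per_day > 0);
    return (minutes + minutes_per_day - 1) / minutes_per_day;
}
```

With these assumptions, `pages_for(1000000, 60)` gives 16,667 pages and `reading_days(16667, 3, 8)` gives 105 working days, in line with the "over 100 days" stated above.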

Furthermore, if a certain set of instructions is of even higher importance in a 
software application, more obscuring instructions can be injected. Consider the example of 
injecting 10,000,000 obscuring instructions. A modern computer can process these
instructions within 80 milliseconds. However, the attacker will be challenged with a total of
166,667 pages of printout, and over three years just to finish reading them casually. The
challenge is practically equivalent to finding a small needle in the Atlantic Ocean, which is 
humanly impossible to do. 

Advances in computer microprocessors are accelerating rapidly.
Today, CPUs that work at one GHz have been announced and CPUs that work at over 500
MHz are commonplace. The faster a CPU can process instructions, the more obscuring
instructions can be injected to protect critical functions of applications, and consequently,
the harder it becomes for attackers to understand, identify or modify the protected
instructions.

Additionally, there are no currently known pattern recognition algorithms that can
automatically parse, understand and locate critical instructions found in a large number of
obscured instructions. Due to the largely random nature of the obscuring instructions, the
pattern recognition task can be highly difficult. By adding more elaborate transformations
and slicing of original code in combination with the large number of injected obscuring
code, development of a pattern recognition algorithm can be made even more difficult. It is
reasonable to assume the computational complexity of such an algorithm is at least as high
as $O(N^3)$, where N is the number of instructions, with the possibility of being even as high
as $O(e^{(\ldots)})$. In the case of 10,000,000 or more instructions, using the $O(N^3)$ estimate, one
can expect the computation can take as long as 250,000 years on a 500 MHz modern-day
personal computer.
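
The 250,000-year figure can be reproduced under two assumptions that the text does not state explicitly: that the work factor is cubic in N, and that each elementary step costs 4 clock cycles (the rate implied by the earlier estimate of 8 milliseconds for 1,000,000 instructions at 500 MHz). The function name is illustrative.

```c
#include <assert.h>

/* Years needed to perform n^3 elementary steps at the given clock rate. */
double cubic_work_years(double n, double clock_hz, double cycles_per_step)
{
    const double seconds_per_year = 365.0 * 24.0 * 3600.0;
    double seconds;

    assert(clock_hz > 0.0);
    seconds = n * n * n * cycles_per_step / clock_hz;
    return seconds / seconds_per_year;
}
```

For `cubic_work_years(1e7, 500e6, 4.0)` the result is a little over 250,000 years, which matches the estimate quoted above.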

On the other end of the scale, it is evident that substantial protection can be achieved 
using far fewer obscuring instructions than 1,000,000. Even 10,000 lines of obscuring 
instructions represent a day's effort to read and typically much more time to understand. 
How much more time is a function of the intricacy of the code. As a practical matter with 
appropriate obscuring instructions, we believe it is reasonable to assume that it would take 
several months' effort to reach sufficient understanding of 10,000 such instructions to be
able to identify and understand the operation of critical instructions embedded in such
obscuring instructions. In some applications, several months' time is enough protection. As
will be apparent, greater amounts of protection can be achieved with increasing numbers of
lines of obscuring instructions. With 100,000 lines of obscuring instructions, we estimate 
the amount of time for one individual to reach an understanding of the operation of the 
critical instructions to be several years, which is often the length of time that a software 
product enjoys commercial success. In such circumstances, 100,000 lines of obscuring 
instructions may be enough protection. 

The transformation function bank 205 is a database previously created to contain
mathematical functions that are one-to-one mappings from Set A to Set B and their inverse
functions that are one-to-one mappings from Set B to Set A. Associated with each
transformation is a unique numeric code. Thus, each numeric code identifies a
transformation and the entire set of numeric codes identifies the entire set of
transformations. Advantageously, each numeric code may simply be the memory address at
which the transformation is stored in the transformation function bank. Preferably, Sets A and B
are sets of numeric codes and the transformation T satisfies the following relationships:

$B = T(A)$

and

$A = T'(B)$

where T' is the mathematical inverse function of T. Examples of T might be to increment the
value of A by 10 or to multiply the value of A by 3; the corresponding inverse functions
would be to decrement the value of B by 10 and to divide the value of B by 3.
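
The two example transformations and their inverses can be written directly in C. The function names are illustrative; the sketch works on plain integers and does not yet restrict values to the range of valid numeric codes.

```c
#include <assert.h>

/* T: increment by 10; T': decrement by 10. */
long t_inc10(long a)     { return a + 10; }
long t_inc10_inv(long b) { return b - 10; }

/* T: multiply by 3; T': divide by 3. The division is exact for every
 * value that t_mul3 can produce, which the assertion checks. */
long t_mul3(long a)      { return a * 3; }
long t_mul3_inv(long b)  { assert(b % 3 == 0); return b / 3; }
```

In both cases applying T and then T' returns the original code, for example `t_mul3_inv(t_mul3(41))` is 41, which is the one-to-one property the bank requires.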

Obscuring code generator 203 applies the transformations obtained from the 
transformation function bank 205 to the numeric codes associated with the obscuring 
instructions obtained from the obscuring code bank 204 to produce more obscuring
instructions. In particular, generator 203 produces blocks of obscuring code 206. The first 
of these blocks is generated by generator 203 by selecting obscuring instructions from code 
bank 204. Additional blocks are generated by selecting transformations 210 from 
transformation function bank 205, applying these transformations to the numeric codes 
associated with the obscuring instructions found in a previously generated block of 
obscuring instructions so as to generate a set of transformed numeric codes and forming 
new blocks 206 of obscuring instructions using the instructions identified by the 
transformed numeric codes. Preferably, the transformations are selected randomly. The 
selected transformation functions are represented by elements 208 in Fig. 2 and their 
inverses by elements 211. 

In the event a transformation generates a numeric code that is outside the range of
numeric codes, the generated numeric code "wraps around" as in modulus arithmetic so as
to generate a numeric code that is within range.

In the embodiment of the invention shown in Fig. 2, the transformations are
concatenated so that any block is generated by a series of mathematical transformations
T such that $C_N = T_N(T_{N-1}(\cdots(T_3(T_2(C_1)))\cdots))$, where $C_1$ is an initial block of obscuring
instructions selected from the obscuring code bank. By concatenating the transformations,
it is possible to generate an enormous number of different transformations while storing
only relatively few transformations in the transformation function bank 205. Alternatively,
each block can be generated from the first block of obscuring instructions using a single
transformation function instead of the concatenated set of functions.
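
The wrap-around rule and the concatenation of transformations can be sketched together. The sketch assumes a bank of 1,000 numeric codes and a toy transformation bank containing only "add an offset" transformations; `BANK_RANGE`, `offsets`, `apply_transform`, `apply_chain` and `undo_chain` are hypothetical names. Additive transformations are used because they remain invertible under wrap-around for any range, whereas a multiplicative transformation stays one-to-one under wrap-around only when the multiplier shares no common factor with the range.

```c
#include <assert.h>
#include <stddef.h>

#define BANK_RANGE 1000u   /* assumed number of numeric codes in the code bank */

/* Toy transformation bank: transformation t adds offsets[t], wrapping around. */
static const unsigned offsets[] = { 10u, 37u, 501u };
#define N_TRANSFORMS (sizeof offsets / sizeof offsets[0])

/* Apply transformation t, or its inverse, to every code of a block in place. */
void apply_transform(unsigned *block, size_t len, size_t t, int inverse)
{
    assert(block != NULL && t < N_TRANSFORMS);
    unsigned shift = inverse ? BANK_RANGE - offsets[t] : offsets[t];
    for (size_t i = 0; i < len; i++) {
        assert(block[i] < BANK_RANGE);
        block[i] = (block[i] + shift) % BANK_RANGE;   /* wrap into range */
    }
}

/* Turn the first block into a later block by applying chain[0], chain[1], ... */
void apply_chain(unsigned *block, size_t len, const size_t *chain, size_t chain_len)
{
    for (size_t j = 0; j < chain_len; j++)
        apply_transform(block, len, chain[j], 0);
}

/* Undo apply_chain by applying the inverse transformations in reverse order. */
void undo_chain(unsigned *block, size_t len, const size_t *chain, size_t chain_len)
{
    for (size_t j = chain_len; j > 0; j--)
        apply_transform(block, len, chain[j - 1], 1);
}
```

Starting from the codes `{0, 995, 500}` and the chain `{0, 1}` (add 10, then add 37), `apply_chain` produces `{47, 42, 547}`; the middle value shows the wrap-around, and `undo_chain` restores the original codes.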

The use of transformations to generate additional blocks of obscuring instructions 
makes it possible to generate enormous numbers of additional obscuring instructions while 
allowing the system to compress and encrypt these instructions. To someone trying to 
understand the instructions, a block of instructions generated by a transformation of 
associated numeric codes can be every bit as difficult to understand as the original block of 
instructions. However, the transformed block can be represented simply by the 
transformation which can be represented by its numeric address in the transformation code 
bank. Thus, while it would require 125 numeric codes associated with instructions in code 
bank 204 to represent a first block of 125 obscuring instructions, a single numeric code 
associated with a transformation in function bank 205 can be used to generate from the first 
code block another 125 numeric codes associated with instructions in code bank 204 to 
represent a second block of 125 obscuring instructions and so on for additional blocks of 
obscuring instructions. Moreover, if the correspondences between the numeric codes and 
the obscuring instructions and the numeric codes and the transformation functions can be 
kept secret, the instructions may also be encrypted. 

It should be noted, however, that there are also computational costs involved in 
generating the additional blocks of code using the transformations. As a result, a typical 
practice is for generator 203 to produce several different blocks of obscuring code by 
selecting instructions from code bank 204 and then generate from each of these blocks of
obscuring code several additional blocks of obscuring instructions by selecting 
transformations 210 from the transformation function bank 205. 

The composition of the first code block, the number of code blocks, the size of each 
code block, the number of obscuring instructions per line of code to be obscured, and the 
compression ratio to be maintained are determined by generator 203 from the critical 
function profile 106, security strength 103, storage size 201 and execution time 202. 

Fig. 3 depicts an obscuring code injector 301, run time apparatus 302 and an 
obscuration compiler 308. In Fig. 3, obscuring code injector 301 combines the serialized 
code blocks 104 and obscuring code blocks 206 with run time apparatus 302 to create a pre- 
compilation obscured image 307. Obscuration compiler 308 uses the pre-compilation 
program image 307 as input to create an obscured object level image 312.

Run time apparatus 302 comprises the necessary programming instructions to load
blocks of machine level code into a computer's memory for execution and to transfer
execution control from one code block to another. A code locator 303 locates one or more
blocks of programming instructions from a data file that will be described later in Fig. 6. A
decryptor 304 decrypts the code block located by the code locator 303 into plain text
machine level code in preparation for execution. A code loader 305 loads the decrypted 
code block into memory and starts the actual execution of the instructions. A control 
handler 306 hands over control of execution from the current code block to the next one in 
queue as soon as the current code block's execution is finished. 
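
The division of labor among the four run time components can be illustrated with a toy model. Portable C cannot execute freshly decrypted machine code, so in this sketch "executing" a block merely adds its decrypted bytes to an accumulator, and the decryptor is a one-byte XOR rather than real cryptography. All names (`struct enc_block`, `decrypt_block`, `execute_block`, `run_blocks`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* An encrypted block as stored in the data file (toy representation). */
struct enc_block {
    const unsigned char *data;
    size_t len;
    unsigned char key;
};

/* Decryptor stand-in: XOR every byte with a one-byte key. */
static void decrypt_block(unsigned char *dst, const struct enc_block *b)
{
    for (size_t i = 0; i < b->len; i++)
        dst[i] = b->data[i] ^ b->key;
}

/* "Execute" a block: each decrypted byte is simply added to acc. */
static long execute_block(const unsigned char *code, size_t len, long acc)
{
    for (size_t i = 0; i < len; i++)
        acc += code[i];
    return acc;
}

/*
 * Control handler: locate, decrypt, load, run and discard blocks in order.
 * Returns the final accumulator, or -1 if a buffer could not be allocated.
 */
long run_blocks(const struct enc_block *blocks, size_t n)
{
    long acc = 0;

    assert(blocks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct enc_block *b = &blocks[i];            /* code locator */
        unsigned char *mem = malloc(b->len ? b->len : 1);  /* loader: fresh address */
        if (mem == NULL)
            return -1;
        decrypt_block(mem, b);                             /* decryptor */
        acc = execute_block(mem, b->len, acc);             /* run the block */
        free(mem);                                         /* hand over to next block */
    }
    return acc;
}
```

Only one block is present in plain text at any time, and each block is placed at whatever address the allocator returns, which loosely mirrors the block-by-block loading described above.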

Obscuring code injector 301 injects the run time apparatus 302 comprising elements
303, 304, 305, and 306 into the serialized critical function source code blocks 104 and the
obscuring code blocks 206 to minimize the possibility for the serialized critical function
source code to be observed. As a result, image 307 comprises multiple collections of blocks
302, 104, and 207. At this stage, the pre-compilation obscured image 307 is ready to be
compiled into object code. The obscuration compiler 308 is applied to pre-compilation
obscured image 307 to create object level code blocks 309, 310, and 311 in correspondence
to blocks 302, 104, and 206. Each collection of a block 309, block 310, and block 311 is
referred to as an object level block $O_i$ 312.

Obscuration compiler 308 is a special purpose apparatus that augments a regular
compiler by preserving the transformation constraints. With a regular compiler, the
transformation function T for adjacent obscuring code blocks 206 would be lost once the
source code is compiled into object level code. However,
obscuration compiler 308 implements the processing logic to preserve such transformation
function even after object level code is created for the source code. Specifically, if the
functional constraint exists between block $C_1$ 206 and block $C_2$ 206 and can be defined as
follows (same as in Fig. 2):

$C_2 = T_2(C_1)$

and

$C_1 = T_2'(C_2)$

where $T_2$ and $T_2'$ are the transformation function and inverse transformation function for $C_1$
and $C_2$, then the obscuration compiler ensures that the corresponding object level code
blocks $O_{C1}$ 311 and $O_{C2}$ 311 satisfy the following constraints:

$O_{C2} = T_2(O_{C1})$

and

$O_{C1} = T_2'(O_{C2})$.

The implementation of a compiler that preserves the transformation information in
this way will be known to those skilled in the art. By so preserving the transformation
information, the transformation functions can be applied to the object level code to achieve
compression, if desired.

In Fig. 4, an encryption processor 401 takes object level blocks 312 as input, and
encrypts them in a recursive chain fashion. The encryption process is applied to all object
level code blocks 312 starting with $O_N$, which includes blocks $O_{LN}$ 309, $O_{EN}$ 310, and $O_{CN}$
311, and ending with $O_2$, which includes blocks $O_{L2}$ 309, $O_{E2}$ 310, and $O_{C2}$ 311. The process
is not applied to object code block $O_1$, which includes blocks $O_{L1}$, $O_{E1}$, and $O_{C1}$. The output
of each stage i of encryption processor 401 is $D_i$ 402. The output of each stage except stage
2 is applied as an input to the encryption processor of the next stage.

In general, each encryption processor $P_i$ scrambles and thereby encrypts object level
code block $O_i$ 312 and the output $D_{i+1}$ of the previous processor in accordance with an
algorithm specified by a key. Advantageously, a different scrambling algorithm is used for
each encryption processor and the key that specifies the algorithm is inserted in clear text
in the output $D_i$. This encryption process ensures that the output data file is encrypted and
cannot be directly de-compiled statically. Because the blocks are encrypted as they are
compressed, attackers cannot directly decompile the data files to obtain the entire obscuring
and critical function instructions. These characteristics force attackers to trace the execution
of the system in this invention during run time as the only feasible means to observe the
obscuring instructions.

Fig. 5 illustrates the data file constructed at the end of the obscuring process. The
final data file contains essentially $O_{L1}$ 309, $O_{E1}$ 310, $O_{C1}$ 311 and $D_2$. $D_2$ in turn contains the
scrambled form of $D_3$, and $D_3$ contains $D_4$, and so on. Code blocks obscured in this
fashion are protected against any direct de-compilation or disassembling attempts, because
the contents of the data file are no longer recognizable for utilities that do not understand the
specific format and de-compression process.
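
The nesting of each stage's output inside the next can be sketched as follows. This is a toy model under loud assumptions: scrambling is a one-byte XOR rather than a real cipher, a block is limited to 255 bytes, and the layer layout (clear-text key, then scrambled length, block bytes and inner layer) is invented for illustration. The names `scramble`, `wrap_layer` and `unwrap_layer` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* XOR stand-in for scrambling; applying it twice with one key restores the data. */
static void scramble(unsigned char *buf, size_t len, unsigned char key)
{
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key;
}

/*
 * Build one layer in out[] from a block (o, o_len < 256), the inner layer
 * (inner, inner_len; may be empty) and a key. Assumed layout:
 *   [key in clear][scrambled: o_len, block bytes, inner layer bytes]
 * Returns the layer length, or 0 if out_cap is too small.
 */
size_t wrap_layer(unsigned char *out, size_t out_cap,
                  const unsigned char *o, size_t o_len,
                  const unsigned char *inner, size_t inner_len,
                  unsigned char key)
{
    size_t total = 2 + o_len + inner_len;

    assert(out != NULL && o_len < 256);
    if (out_cap < total)
        return 0;
    out[0] = key;                         /* key travels in clear text */
    out[1] = (unsigned char)o_len;
    if (o_len > 0)
        memcpy(out + 2, o, o_len);
    if (inner_len > 0)
        memcpy(out + 2 + o_len, inner, inner_len);
    scramble(out + 1, total - 1, key);    /* everything after the key */
    return total;
}

/*
 * Peel one layer in place. *o and *o_len receive the recovered block; the
 * return value is the offset of the inner layer within d (d_len if none).
 */
size_t unwrap_layer(unsigned char *d, size_t d_len,
                    unsigned char **o, size_t *o_len)
{
    assert(d != NULL && d_len >= 2);
    scramble(d + 1, d_len - 1, d[0]);
    *o_len = d[1];
    assert(2 + *o_len <= d_len);
    *o = d + 2;
    return 2 + *o_len;
}
```

Because the inner layer is itself scrambled under the outer key, the innermost block cannot be read until every enclosing layer has been peeled in order, which is the property the recursive chain is meant to provide.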

Because the number of obscuring instructions can be in the millions, it is also 
desirable to incorporate data compression technology in the encryption processors 401. In 
the case where blocks of obscuring code are generated by mathematical transformations 
from a first block of obscuring code generated from obscuring code bank 204, substantial
compression can be achieved simply by representing each block in terms of the first block 
of obscuring code and the numerical codes representing the transformations used to generate 
the block. Since obstruction compiler 308 ensures that the object level code that represents 
the obscuring instructions is related by just such a series of transformations, such 
compression is achievable by replacing the object level code blocks OCi 311 with the
transformations that are used to generate these blocks. In addition, the numeric codes that
identify these transformations can readily be scrambled and thereby encrypted at the same
time as the encryption processor scrambles and encrypts object level code blocks OLi 309
and OEi 310.
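
By way of illustration only, the compression described above may be sketched as follows. The sketch assumes a small bank of three byte-wise transformations identified by the numeric codes 1 to 3 and assumes that every block is produced by applying one transformation to the first block; the bank contents, the one-byte length prefix, and the function names are hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict, List

# Hypothetical transformation bank: numeric code -> transformation.
TRANSFORMS: Dict[int, Callable[[bytes], bytes]] = {
    1: lambda b: b[::-1],                          # reverse the byte order
    2: lambda b: bytes((x + 1) % 256 for x in b),  # add one to every byte
    3: lambda b: bytes(x ^ 0xFF for x in b),       # invert every byte
}

def generate_blocks(first: bytes, codes: List[int]) -> List[bytes]:
    """Generate one obscuring block per numeric code from the first block."""
    return [TRANSFORMS[code](first) for code in codes]

def compress(first: bytes, codes: List[int]) -> bytes:
    """Store the first block once, followed by one byte per transformation code.

    Assumes the first block is shorter than 256 bytes so that its length
    fits in the one-byte prefix.
    """
    return bytes([len(first)]) + first + bytes(codes)

def decompress(blob: bytes) -> List[bytes]:
    """Recreate every block from the first block and the stored codes."""
    length = blob[0]
    first, codes = blob[1:1 + length], blob[1 + length:]
    return [TRANSFORMS[code](first) for code in codes]
```

In this sketch each additional block costs one byte in the compressed form regardless of the size of the block it represents.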

In particular, in a preferred embodiment of the invention that both encrypts and 
compresses the object level code, encryption processor Pn 401 scrambles OLn 309 and OEn
310 and the numeric code representing transformation Tn in accordance with an algorithm
specified by a key keyn. These scrambled values and the clear text value of keyn constitute
output Dn 402. Compression is achieved by representing OCn 311 in terms of the scrambled
numeric code representing transformation Tn.

Subsequently, Dn 402, a numeric code representing transformation Tn-1, a key keyn-1 and
the next set of object level code blocks On-1 are used as inputs for encryption processor Pn-1
401. At this step, encryption processor Pn-1 401 scrambles OLn-1 309, OEn-1 310 and Dn 402,
and a numeric code representing transformation Tn-1. These scrambled values and the clear
text value of keyn-1 constitute output Dn-1 402. Again, compression is achieved by
representing OCn-1 311 in terms of the scrambled numeric code representing transformation
Tn-1.

The compression and scrambling process continues for all the object level code 
blocks in 312 in the same fashion for sequence numbers n-2, n-3, n-4, ... except for OL1 309,
OE1 310, and OC1 311.

Once the process is complete, OL1 309 can retrieve key2 so as to de-scramble D2 402
at runtime to retrieve T2, OL2 309, OE2 310 and D3; and OC2 311 can be recreated using OC1
311 and the unscrambled numeric code for transformation T2. However, until D2 is
descrambled and OL2 is executed, D3 remains undistinguishable. And similarly, until D3 is
descrambled and OL3 is executed, D4 remains undistinguishable; and so on. Such constructs
ensure that all the code blocks are only observable when the data blocks OL1, D2,
D3, ..., Dn-1, and Dn are descrambled and executed at runtime.

Fig. 6 depicts a typical microcomputer system architecture and the execution model 
of the run time apparatus of the present invention. A data file 602 is stored on the 
computer's hard disk 601. The data file is loaded into the computer's real memory 603 at
run time through the computer's main bus system 604. The actual memory space required 
to execute the code blocks contained in the data file is allocated separately and is illustrated 
at 605. 

Fig. 7 illustrates how the run time execution process starts with the first set of code 
blocks being loaded into memory and executed. 

The executions of the code blocks are conducted within the memory address space
indicated as 605. As the first step, OL1 701 is loaded at memory address L 711. This
address will remain unchanged for all future OL2, ..., OLn-1, OLn code blocks. CodeLoader
702 of OL1 executes within this space to allocate a dynamic memory location at address E1
710. It is important that address E1 710 be dynamically assigned to ensure that the execution
process of the code blocks cannot be automatically traced at a fixed address using
conventional or commercially available tools. Because of the dynamic nature of this
address allocation, address E2 for the next set of code blocks O2 cannot be determined
until the active instructions of O1 de-scramble OL2.

At address E,, the runtime image of a series of code blocks is loaded and executed, 
including OE1 704, OC1 705, Get(key2) 706, Decrypt(OL2) 707, Load(OL2) 708, and "Jump
To Address L" 709. OE1 704 and OC1 705 are the mixed instruction blocks that contain both
the original instructions in critical function source code 101 and the obscuring instructions 
for the first code block. The execution of instruction blocks 704 and 705 is the most 
essential action at this stage. 

Get(key2) 706 is the set of instructions that retrieves the encryption key key2 so that OL2
can be decrypted and loaded into address L. Decrypt(OL2) 707 is the set of instructions that
actually decrypts and creates the executable OL2 code blocks. "Load (OL2) at Address L"
708 loads the decrypted OL2 instructions into the static memory address L 711 ready for the
next step of processing. "Jump to L" 709 hands the control of execution to the instructions
loaded at address L 711. At this stage, the essential functionality has been completed for
step one and the system is ready to load and execute the next set of code blocks OL2 309, OE2
310, and OC2 311.

In Fig. 8, OL2 801 has been decrypted and loaded into the static memory address L
711. A code locator 802 locates the compressed and scrambled code for OE2 807 and OC2
808 from the .DAT data file cache 606 and retrieves them for decryption and 
decompression. Decryptor 803 executes the actual decryption and decompression of the 
retrieved code blocks. A code loader 804 dynamically determines a memory address 806, 
allocates the necessary space, and loads the decrypted code blocks into it for execution. A 
control handler 805 transfers control of execution to the instructions loaded at 806. 

OE2 807 and OC2 808 contain the true instructions within the original critical function
source code 101 and the obscuring code blocks. They are first executed as the most
essential functionality of this step. Subsequent code segments 809, 810, and 811 are similar
to the apparatus described in Fig. 7, and are executed to retrieve OL3, load it into the static
memory space at address L, then transfer execution control over to the instructions in that 
memory space. 

The execution of subsequent sets of code blocks, OL4 309, OE4 310, OC4 311, OL5
309, OE5 310, OC5 311, ..., follows the same process as described above until all code blocks
are loaded in memory and executed.
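
By way of illustration only, the run time process described above may be simulated as follows. The sketch models memory as a dictionary, uses a single-byte XOR as a stand-in for the scrambling algorithm, and lets each stage carry the key of the next stage in its first byte; the constant ADDRESS_L, the address range used by allocate, and all function names are hypothetical and are not taken from the specification. An actual implementation would load and execute machine instructions rather than record byte strings.

```python
import secrets
from typing import Dict, List, Tuple

ADDRESS_L = 0x1000  # hypothetical fixed address at which every loader block is placed

def xor(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)

def build_data_file(bodies: List[bytes], keys: List[int]) -> List[bytes]:
    """Stage 0 is stored in the clear; keys[j] scrambles stage j+1 and is carried by stage j."""
    stages = []
    for j, body in enumerate(bodies):
        next_key = keys[j] if j < len(keys) else 0
        plain = bytes([next_key]) + body
        stages.append(plain if j == 0 else xor(plain, keys[j - 1]))
    return stages

def allocate(memory: Dict[int, bytes]) -> int:
    """Pick an unused, page-aligned execution address that cannot be predicted."""
    while True:
        address = 0x10000 + secrets.randbelow(1 << 16) * 0x1000
        if address not in memory:
            return address

def run(data_file: List[bytes]) -> List[Tuple[int, bytes]]:
    """Return the (execution address, code) pairs in the order they were 'executed'."""
    memory: Dict[int, bytes] = {ADDRESS_L: data_file[0]}
    trace = []
    for i in range(len(data_file)):
        loader = memory[ADDRESS_L]
        next_key, body = loader[0], loader[1:]
        exec_address = allocate(memory)      # dynamically assigned address
        memory[exec_address] = body
        trace.append((exec_address, body))   # stands in for executing the blocks
        del memory[exec_address]
        if i + 1 < len(data_file):
            # Get key, decrypt the next loader, load it at the fixed address, jump.
            memory[ADDRESS_L] = xor(data_file[i + 1], next_key)
    return trace
```

In this simulation the loader address never changes, whereas the execution address of each stage is known only after the preceding stage has run.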

Because the run time apparatus in this invention allows dynamic loading and
execution of the blocks in the data file, a virtually arbitrary number of obscuring instructions
can be executed as long as the execution overhead limit permits. Furthermore, because every
block of instructions is executed at a dynamically assigned memory address, tracing
execution of these blocks is a challenging task. Without highly specialized hardware devices,
locating the address where a block of instructions is loaded in memory is virtually
impossible. These characteristics of the runtime system ensure that obtaining and observing
instructions in memory using tracing techniques are laborious and time consuming to the
extent of being humanly impossible without the support of highly expensive and specially
designed hardware devices.

The method and system described in the present invention can be applied to any 
digital material that includes an executable component. Whenever a software application 
includes implementation of highly valuable technology or other trade secrets, respective 
programming instructions can take advantage of the obscuring capability of the current 
invention. For computer security related products, the present invention offers these
products robust anti-trace and anti-decompilation protection for programming instructions 
that are most vulnerable and critical in the products. 

Additionally, an embodiment of the present invention can enable different sets of 
obscuring programming instructions to be injected for each different protected product, user 
desktop computer, or user identification. High performance back end server systems can be 
optimized to extract obscuring instructions from the obscuring code bank specifically and 
differently according to the input of a machine id, user id, or other uniquely identifying 
parameters. Such a capability prevents a compromise of the security of one product,
customer, or machine from being generically applied to the others. It can ensure that the same
amount of computational resource is required to crack each product or machine across a
product line or customer line. 
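
By way of illustration only, the selection of a different set of obscuring instructions for each machine id and user id may be sketched as follows. The sketch assumes that the obscuring code bank can be treated as a sequence of entries and that a SHA-256 digest of the identifiers seeds a pseudo-random selection; the function name select_obscuring and the separator character are hypothetical and are not taken from the specification.

```python
import hashlib
import random
from typing import List, Sequence

def select_obscuring(bank: Sequence[str], machine_id: str, user_id: str,
                     count: int) -> List[str]:
    """Choose `count` distinct bank entries, reproducibly for a given pair of ids."""
    digest = hashlib.sha256(f"{machine_id}|{user_id}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    return rng.sample(list(bank), count)
```

The same identifiers always yield the same selection, so a back end server can regenerate a given build, while different identifiers yield unrelated selections.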

As indicated, in the same fashion as the invention is used to protect critical software, 
the invention may also be applied to the protection of critical data by hiding the critical data 
amid vast quantities of obscuring data generated from an obscuring data bank. Additional 
quantities of obscuring data may also be generated by transforming the obscuring data using 
a transformation function bank. The obscured data may likewise be encrypted and/or
compressed either as part of the same process that encrypts and/or compresses the software 
or independently thereof. 
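
By way of illustration only, hiding critical data amid obscuring data may be sketched as follows. The sketch assumes that the critical bytes are interleaved with obscuring bytes at positions derived from a seed known to the recovering party; the positional scheme and the function names hide and recover are hypothetical and are not taken from the specification, and the pseudo-random generator used is not cryptographically secure.

```python
import random

def hide(critical: bytes, obscuring: bytes, seed: int) -> bytes:
    """Interleave critical bytes among obscuring bytes at seed-derived positions."""
    total = len(critical) + len(obscuring)
    positions = set(random.Random(seed).sample(range(total), len(critical)))
    critical_iter, obscuring_iter = iter(critical), iter(obscuring)
    return bytes(
        next(critical_iter) if i in positions else next(obscuring_iter)
        for i in range(total)
    )

def recover(blob: bytes, critical_length: int, seed: int) -> bytes:
    """Re-derive the positions from the seed and read the critical bytes in order."""
    positions = sorted(random.Random(seed).sample(range(len(blob)), critical_length))
    return bytes(blob[i] for i in positions)
```

The critical bytes keep their relative order, so reading the seed-derived positions in ascending order restores them.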

In conclusion, the present invention makes it possible to protect critical 
programming instructions and/or data by injecting a large amount of obscuring instructions 
and/or data to the extent that observing and understanding of the obscured instructions 
and/or data is not humanly feasible. The apparatus and system of the present invention
facilitate compression of obscuring instructions and/or data and the runtime execution of the
obscuring instructions and/or data so that neither direct de-compilation nor real time tracing
of the obscuring instructions and/or data can be achieved without the use of extensive and
expensive computing resources only affordable by large organizations over an extraordinary 
time span. 